An Unsupervised Approach for Mapping between Vector Spaces
Authors
Abstract
We present a language-independent, unsupervised approach for transforming word embeddings from a source language to a target language using a transformation matrix. Our model addresses the data scarcity faced by many of the world's languages and yields improved embeddings for target-language words by relying on the transformed embeddings of source-language words. We first evaluate our approach on word similarity tasks for a closely related language pair, with Hindi as the source and Urdu as the target language; we also evaluate it with English as the source language and French and German as target languages. Our approach improves on the current state-of-the-art results by 13% for French and 19% for German. For Urdu, we observe an improvement of 16% over our initial baseline score. We further explore the approach by applying it to multiple models of the same language and transferring words between the two models, thereby solving the problem of missing words in a model. We evaluate this on word similarity and word analogy tasks.
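The idea of moving words between two embedding models through a transformation matrix can be illustrated with a small sketch. The abstract does not say how the matrix is estimated, so the ordinary least-squares fit over the shared vocabulary used here is an assumption for illustration, not the authors' procedure. The function names (`fit_transformation`, `transfer_missing_words`) and the toy data are hypothetical.

```python
import numpy as np


def fit_transformation(source_vecs, target_vecs):
    """Assumed estimator: least-squares W minimising ||source_vecs @ W - target_vecs||_F."""
    W, _, _, _ = np.linalg.lstsq(source_vecs, target_vecs, rcond=None)
    return W


def transfer_missing_words(model_a, model_b, W):
    """Copy model_b and add a transformed vector for every word found only in model_a."""
    filled = dict(model_b)
    for word, vec in model_a.items():
        if word not in filled:
            filled[word] = vec @ W
    return filled


# Toy data (hypothetical): model_b is an exact linear image of model_a,
# except that the word "w49" is missing from model_b.
rng = np.random.default_rng(0)
vocab = [f"w{i}" for i in range(50)]
model_a = {w: rng.normal(size=8) for w in vocab}
true_map = rng.normal(size=(8, 8))
model_b = {w: model_a[w] @ true_map for w in vocab if w != "w49"}

# Fit W on the words both models share, then fill in the missing word.
shared = sorted(set(model_a) & set(model_b))
A = np.stack([model_a[w] for w in shared])
B = np.stack([model_b[w] for w in shared])
W = fit_transformation(A, B)
filled = transfer_missing_words(model_a, model_b, W)
```

In this toy setting the two spaces are related by an exact linear map, so the transferred vector for `w49` matches the vector that model B would have held. Real embedding models are only approximately linearly related, so a transferred vector would be an estimate whose quality has to be checked, for example on the word similarity and word analogy tasks the abstract mentions.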
Similar resources
s-Topological vector spaces
In this paper, we have defined and studied a generalized form of topological vector spaces called s-topological vector spaces. s-topological vector spaces are defined by using semi-open sets and semi-continuity in the sense of Levine. Along with other results, it is proved that every s-topological vector space is a generalized homogeneous space. Every open subspace of an s-topological vector space is...
Unsupervised Word Mapping Using Structural Similarities in Monolingual Embeddings
Most existing methods for automatic bilingual dictionary induction rely on prior alignments between the source and target languages, such as parallel corpora or seed dictionaries. For many language pairs, such supervised alignments are not readily available. We propose an unsupervised approach for learning a bilingual dictionary for a pair of languages given their independently-learned monoling...
Group-Induced Vector Spaces
The strength of classifier combination lies either in a suitable averaging over multiple experts/sources or in a beneficial integration of complementary approaches. In this paper we focus on the latter and propose the use of group-induced vector spaces (GIVSs) as a way to combine unsupervised learning with classification. In such an integrated approach, the data is first modelled by a number of...
Complete lattice learning for multivariate mathematical morphology
The generalization of mathematical morphology to multivariate vector spaces is addressed in this paper. The proposed approach is fully unsupervised and consists in learning a complete lattice from an image as a nonlinear bijective mapping, interpreted in the form of a learned rank transformation together with an ordering of vectors. This unsupervised ordering of vectors relies on three steps: d...
On intermediate value theorem in ordered Banach spaces for noncompact and discontinuous mappings
In this paper, a vector version of the intermediate value theorem is established. The main theorem of this article can be considered an improvement of the main results that appeared in [On fixed point theorems for monotone increasing vector valued mappings via scalarizing, Positivity, 19 (2) (2015) 333-340], including the uniqueness and convergence of each iteration to the fixe...
Kernel Regression Mapping for Vocal EEG Sonification
This paper introduces kernel regression mapping sonification (KRMS) for optimized mappings between data features and the parameter space of Parameter Mapping Sonification. Kernel regression makes it possible to map data spaces to high-dimensional parameter spaces such that specific locations in data space with pre-determined extent are represented by selected acoustic parameter vectors. Thereby, specifical...
Journal title:
- CoRR
Volume abs/1711.05680, Issue -
Pages -
Publication date 2017